IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

In re application of: HOTTEN et al.

Serial No.: unknown

Filed: August 20, 1999

For: GROWTH/DIFFERENTIATION FACTORS OF THE TGF-β FAMILY

PRELIMINARY AMENDMENT 

Assistant Commissioner for Patents 

Washington, D.C. 20231

August 25, 1999

Sir: 

Prior to calculation of the filing fee and prior to the examination of this 
application, please amend the above-identified application as follows: 

IN THE CLAIMS:

Kindly cancel claims 1-19 without prejudice or disclaimer. 
Please add the following new claims to the application. 

-20. An antibody or antibody fragment which specifically binds to a protein of the 
TGF-β family wherein said protein is encoded by a DNA comprising a nucleotide
sequence selected from the following group: 

(a) the nucleotide sequence as shown in SEQ ID NO:1, 

(b) a nucleotide sequence which is degenerate as a result of the genetic code to 
the nucleotide sequence of (a), and 

(c) fragments of (a) or (b) which encode a protein which has essentially the same
cartilage or bone inducing activities as a mature protein encoded by the nucleotide
sequence of SEQ ID NO:1.

21. The antibody according to claim 20, wherein said antibody is a monoclonal 
antibody. 

22. An antibody or antibody fragment according to claim 20, which specifically 
binds to a protein of the TGF-β family wherein said protein comprises the amino acid
sequence according to SEQ ID NO:3. 

23. The antibody according to claim 22, wherein said antibody is a monoclonal 
antibody. 

24. An antibody or antibody fragment which specifically binds a protein of the 
TGF-β family, wherein said protein is encoded by a DNA comprising a nucleotide
sequence selected from the following group: 

(a) the nucleotide sequence as shown in SEQ ID NO:2, 

(b) a nucleotide sequence which is degenerate as a result of the genetic code to 
the DNA of (a), 

(c) a nucleotide sequence which hybridizes under the following stringent 
hybridization conditions to the DNA in (a), or (b): hybridization at a salt concentration of 
4X SSC at 62°-66°C followed by a one-hour wash with 0.1X SSC and 0.1% SDS at 
62°-66°C, and 




(d) fragments of (a), (b) or (c) which encode a protein which has essentially the 
same cartilage or bone inducing activity as a mature protein encoded by the nucleotide 
sequence of SEQ ID NO:2. 

25. An antibody or antibody fragment according to claim 24, wherein said protein 
comprises the amino acid sequence according to SEQ ID NO:4. 

26. The antibody according to claim 25, wherein said antibody is a monoclonal 
antibody. 

27. The antibody according to claim 24, wherein said antibody is a monoclonal 
antibody. 

28. A method for detecting a protein of the TGF-β family,

comprising incubating an antibody or antibody fragment which specifically binds
to a protein of the TGF-β family with a sample suspected of containing said protein, and

detecting any antibody/protein complex formed as an indication of the presence
of said protein,

wherein said protein is encoded by a DNA comprising a nucleotide sequence
selected from the following group:

(a) the nucleotide sequence as shown in SEQ ID NO:1,

(b) a nucleotide sequence which is degenerate as a result of the genetic code to
the nucleotide sequence of (a), and

(c) fragments of (a) or (b) which encode a protein which has essentially the same
cartilage or bone inducing activities as a mature protein encoded by the nucleotide
sequence of SEQ ID NO:1.

29. A method for detecting a protein of the TGF-β family, comprising

incubating an antibody or antibody fragment which specifically binds to said
protein of the TGF-β family with a sample suspected of containing said protein, and

detecting any antibody/protein complex formed as an indication of the presence 
of said protein, 

wherein said protein is encoded by a DNA comprising a nucleotide sequence 
selected from the following group: 

(a) the nucleotide sequence as shown in SEQ ID NO:2, 

(b) a nucleotide sequence which is degenerate as a result of the genetic code to 
the DNA of (a), 

(c) a nucleotide sequence which hybridizes under the following stringent 
hybridization conditions to the DNA in (a), or (b): hybridization at a salt concentration of 
4X SSC at 62°-66°C followed by a one-hour wash with 0.1X SSC and 0.1% SDS at 
62°-66°C, and 

(d) fragments of (a), (b) or (c) which encode a protein which has essentially the 
same cartilage or bone inducing activity as a mature protein encoded by the nucleotide 
sequence of SEQ ID NO:2. 

30. A kit for detecting a protein of the TGF-β family, comprising

an antibody or antibody fragment which specifically binds to a protein of the
TGF-β family, and

a reaction buffer, 

wherein said protein is encoded by a DNA comprising a nucleotide sequence 
selected from the following group: 

(a) the nucleotide sequence as shown in SEQ ID NO:1, 

(b) a nucleotide sequence which is degenerate as a result of the genetic code to 
the nucleotide sequence of (a), and 

(c) fragments of (a) or (b) which encode a protein which has essentially the same 
cartilage or bone inducing activities as a mature protein encoded by the nucleotide 
sequence of SEQ ID NO:1.

31. A kit for detecting a protein of the TGF-β family, comprising

an antibody or antibody fragment which specifically binds to a protein of the
TGF-β family, and

a reaction buffer, 

wherein said protein is encoded by a DNA comprising a nucleotide sequence 
selected from the following group: 

(a) the nucleotide sequence as shown in SEQ ID NO:2, 

(b) a nucleotide sequence which is degenerate as a result of the genetic code to 
the DNA of (a), 

(c) a nucleotide sequence which hybridizes under the following stringent
hybridization conditions to the DNA in (a), or (b): hybridization at a salt concentration of
4X SSC at 62°-66°C followed by a one-hour wash with 0.1X SSC and 0.1% SDS at
62°-66°C, and

(d) fragments of (a), (b) or (c) which encode a protein which has essentially the 
same cartilage or bone inducing activity as a mature protein encoded by the nucleotide 
sequence of SEQ ID NO:2. - 

REMARKS 

The above amendments have been made to put the application into better 
condition for examination. 

In the event that any fees are due in connection with this paper, please 
charge our Deposit Account No. 14-1060. 

Respectfully submitted, 
NIKAIDO, MARMELSTEIN, MURRAY & ORAM LLP 

The present invention relates to DNA sequences encoding novel
growth/differentiation factors of the TGF-β family. In
particular, it relates to novel DNA sequences encoding TGF-β-like
proteins, to the isolation of said DNA sequences, to
expression plasmids containing said DNA, to microorganisms
transformed by said expression plasmid, to the production of
said protein by culturing said transformant, and to pharmaceutical
compositions containing said protein. The TGF-β
family of growth factors, comprising BMP-, TGF- and Inhibin-
related proteins (Roberts and Sporn, Handbook of Experimental
Pharmacology 95 (1990), 419-472), is of particular relevance
in a wide range of medical treatments and applications. These
factors are useful in processes relating to wound healing and
tissue repair. Furthermore, several members of the TGF-β
family are tissue inductive, especially osteo-inductive, and
consequently play a crucial role in inducing cartilage and
bone development.

Wozney, Progress in Growth Factor Research 1 (1989), 267-280
and Vale et al., Handbook of Experimental Pharmacology 95
(1990), 211-248 describe different growth factors such as
those relating to the BMP (bone morphogenetic proteins) and
the Inhibin group. The members of these groups share
significant structural similarity. The precursor of the
protein is composed of an aminoterminal signal sequence, a
propeptide and a carboxyterminal sequence of about 110
amino acids, which is subsequently cleaved from the precursor
and represents the mature protein. Furthermore, their members
are defined by virtue of amino acid sequence homology. The
mature protein contains the most conserved sequences,
especially seven cysteine residues which are conserved among
the family members. The TGF-β-like proteins are
multifunctional, hormonally active growth factors. They also
share related biological activities such as chemotactic
attraction of cells, promoting cell differentiation and their
tissue-inducing capacity, such as cartilage- and bone-
inducing capacity. U.S. Patent No. 5,013,649 discloses DNA
sequences encoding osteo-inductive proteins termed BMP-2
proteins (bone morphogenetic protein), and U.S. patent
applications serial nos. 179 101 and 179 197 disclose the BMP
proteins BMP-1 and BMP-3. Furthermore, many cell types are
able to synthesize TGF-β-like proteins and virtually all
cells possess TGF-β receptors.

Taken together, these proteins show differences in their 
structure, leading to considerable variation in their 
detailed biological function. Furthermore, they are found in 
a wide variety of different tissues and developmental stages. 
Consequently, they might possess differences concerning their 
function in detail, for instance the required cellular 
physiological environment, their lifespan, their targets, 
their requirement for accessory factors, and their resistance 
to degradation. Thus, although numerous proteins exhibiting 
tissue-inductive, especially osteo-inductive potential are
described, their natural role in the organism and, more
importantly, their medical relevance must still be elucidated
in detail. The occurrence of still-unknown members of the
TGF-β family relevant for osteogenesis or
differentiation/induction of other tissues is strongly
suspected. However, a major problem in the isolation of these
new TGF-β-like proteins is that their functions cannot yet be
described precisely enough for the design of a discriminative
bioassay. On the other hand, the expected nucleotide sequence
homology to known members of the family would be too low to
allow for screening by classical nucleic acid hybridization
techniques. Nevertheless, the further isolation and
characterization of new TGF-β-like proteins is urgently
needed in order to get hold of the whole set of induction and
differentiation proteins meeting all desired medical
requirements. These factors might find useful medical
applications in defect healing and treatments of degenerative
disorders of bone and/or other tissues like, for example,
kidney and liver.

Thus, the technical problem underlying the present invention 
essentially is to provide DNA sequences coding for new 
members of the TGF-β protein family having mitogenic and/or
differentiation-inductive, e.g. osteo-inductive potential.

The solution to the above technical problem is achieved by 
providing the embodiments characterized in claims 1 to 17.
Other features and advantages of the invention will be 
apparent from the description of the preferred embodiments 
and the drawings. The sequence listings and drawings will now 
briefly be described. 

SEQ ID NO. 1 shows the nucleotide sequence of MP-52, i.e. the
embryo derived sequence corresponding to the mature peptide
and most of the sequence coding for the propeptide of MP-52.
Some of the propeptide sequence at the 5'-end of MP-52 has
not been characterized so far.

SEQ ID NO. 2 shows the nucleotide sequence of MP-121, i.e. 
the liver derived sequence corresponding to the mature 
peptide, the sequence coding for the propeptide of MP-121, 
and sequences 5' and 3' to the coding region.




The start codon begins with nucleotide 128 of SEQ ID NO. 2.
The sequence coding for the mature MP-121 polypeptide begins
with nucleotide 836 of SEQ ID NO. 2. The stop codon begins
with nucleotide 1184 of SEQ ID NO. 2. The sequence coding for
the precursor protein has a length of 1056 bp. The sequence
coding for the propeptide has a length of 708 bp and the
sequence coding for the mature peptide has a length of 348
bp.

SEQ ID NO. 3 shows the amino acid sequence of MP-52 as
deduced from SEQ ID NO. 1.

SEQ ID NO. 4 shows the amino acid sequence of MP-121 as
deduced from SEQ ID NO. 2. The sequence of the mature
polypeptide begins with amino acid 237 of SEQ ID NO. 4. The 
precursor protein has a length of 352 amino acids. The 
propeptide and the mature peptide have a length of 236 and 
116 amino acids, respectively. 
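
The lengths quoted above for SEQ ID NO. 2 and SEQ ID NO. 4 can be cross-checked from the three stated nucleotide positions. The following is a minimal sketch, assuming 1-based positions and that the stop codon is not counted as coding sequence; the function and variable names are illustrative only and do not appear in the application.

```python
# Positions quoted in the text for SEQ ID NO. 2 (assumed 1-based).
START_CODON = 128    # first nucleotide of the start codon
MATURE_START = 836   # first nucleotide coding for the mature MP-121 peptide
STOP_CODON = 1184    # first nucleotide of the stop codon


def region_lengths(start, mature_start, stop):
    """Return (precursor, propeptide, mature) coding lengths in bp.

    The stop codon itself is excluded from every length.
    """
    precursor = stop - start
    propeptide = mature_start - start
    mature = stop - mature_start
    return precursor, propeptide, mature


precursor_bp, propeptide_bp, mature_bp = region_lengths(
    START_CODON, MATURE_START, STOP_CODON
)

# Three nucleotides per codon give the amino acid lengths of SEQ ID NO. 4.
precursor_aa, propeptide_aa, mature_aa = (
    n // 3 for n in (precursor_bp, propeptide_bp, mature_bp)
)

# The mature polypeptide starts right after the last propeptide residue.
first_mature_residue = propeptide_aa + 1
```

Under these assumptions the sketch reproduces the figures given in the text: 1056, 708 and 348 bp, and 352, 236 and 116 amino acids, with the mature polypeptide starting at amino acid 237.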

SEQ ID NO. 5 shows a part of the nucleotide sequence of the
liver derived sequence of MP-121.

SEQ ID NO. 6 shows a part of the nucleotide sequence of the
embryo derived sequence of MP-52.

The shorter DNA sequences SEQ ID NO. 5 and 6 can be useful,
for example, for isolation of further members of the TGF-β
protein family.

Figure 1 shows an alignment of the amino acid sequences of
MP-52 and MP-121 starting from the first of the seven
conserved cysteines with some related proteins. 1a shows the
alignment of MP-52 with some members of the BMP protein
family; 1b shows the alignment of MP-121 with some members of
the Inhibin protein family. * indicates that the amino acid
is the same in all proteins compared; + indicates that the
amino acid is the same in at least one of the proteins
compared with MP-52 (Fig. 1a) or MP-121 (Fig. 1b).

Figure 2 shows the nucleotide sequences of the oligonucleotide
primers as used in the present invention and an
alignment of these sequences with known members of the TGF-β
family. M means A or C; S means C or G; R means A or G; and K
means G or T. 2a depicts the sequence of the primer OD; 2b
shows the sequence of the primer OID.
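
The ambiguity symbols defined for Figure 2 (M, S, R, K) each stand for two bases, so a degenerate primer is in effect a pool of distinct oligonucleotides. The sketch below illustrates how such a primer expands. The example string `ATGMSRK` is purely hypothetical and is not the sequence of OD or OID, which are given only in Figure 2.

```python
from itertools import product

# Ambiguity codes exactly as defined in the Figure 2 legend, plus plain bases.
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC",  # A or C
    "S": "CG",  # C or G
    "R": "AG",  # A or G
    "K": "GT",  # G or T
}


def degeneracy(primer):
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    count = 1
    for symbol in primer:
        count *= len(AMBIGUITY[symbol])
    return count


def expand(primer):
    """List every concrete sequence covered by a degenerate primer."""
    choices = (AMBIGUITY[symbol] for symbol in primer)
    return ["".join(bases) for bases in product(*choices)]


# Hypothetical primer fragment, for illustration only.
EXAMPLE = "ATGMSRK"
variants = expand(EXAMPLE)
```

Each ambiguous position doubles the pool, so the four ambiguous positions in the hypothetical example give 16 variants.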

The present invention relates to novel TGF-β-like proteins
and provides DNA sequences contained in the corresponding
genes. Such sequences include nucleotide sequences
comprising the sequence

ATGAACTCCATGGACCCCGAGTCCACA and

CTTCTCAAGGCCAACACAGCTGCAGGCACC

and in particular sequences as illustrated in SEQ ID Nos. 1
and 2, allelic derivatives of said sequences and DNA
sequences degenerated as a result of the genetic code for
said sequences. They also include DNA sequences hybridizing
under stringent conditions with the DNA sequences mentioned
above and containing the following amino acid sequences:

Met-Asn-Ser-Met-Asp-Pro-Glu-Ser-Thr or

Leu-Leu-Lys-Ala-Asn-Thr-Ala-Ala-Gly-Thr.

Although said allelic, degenerate and hybridizing sequences
may have structural divergences due to naturally occurring
mutations, such as small deletions or substitutions, they
will usually still exhibit essentially the same useful
properties, allowing their use in basically the same medical
applications.
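
The two nucleotide sequences and the two peptide sequences given above correspond to each other under the standard genetic code. A minimal sketch of that check follows; the codon table is the standard one, and the function name `translate` is an illustrative choice, not part of the application.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate(dna):
    """Translate a DNA string in frame 1 into hyphenated three-letter codes."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "-".join(THREE_LETTER[CODON_TABLE[c]] for c in codons)


first = translate("ATGAACTCCATGGACCCCGAGTCCACA")
second = translate("CTTCTCAAGGCCAACACAGCTGCAGGCACC")
```

Here `first` evaluates to Met-Asn-Ser-Met-Asp-Pro-Glu-Ser-Thr and `second` to Leu-Leu-Lys-Ala-Asn-Thr-Ala-Ala-Gly-Thr, matching the peptide sequences listed in the text.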

According to the present invention, the term "hybridization"
means conventional hybridization conditions, preferably
conditions with a salt concentration of 6 x SSC at 62° to
66°C followed by a one-hour wash with 0.6 x SSC, 0.1% SDS at
62° to 66°C. The term "hybridization" preferably refers to
stringent hybridization conditions with a salt concentration
of 4 x SSC at 62°-66°C followed by a one-hour wash with 0.1 x
SSC, 0.1% SDS at 62°-66°C.
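
To make the difference between the conventional and the stringent conditions more concrete, the SSC multiples can be converted into approximate sodium ion concentrations. The sketch below assumes the usual laboratory composition of 1 x SSC (150 mM NaCl and 15 mM trisodium citrate), which the application itself does not state, and assumes complete dissociation of both salts.

```python
# Assumed composition of 1 x SSC (standard laboratory recipe, not from the text).
NACL_M_1X = 0.150      # mol/l sodium chloride
CITRATE_M_1X = 0.015   # mol/l trisodium citrate (three Na+ per formula unit)


def sodium_molar(ssc_multiple):
    """Approximate Na+ concentration (mol/l) of an SSC solution."""
    return ssc_multiple * (NACL_M_1X + 3 * CITRATE_M_1X)


# SSC multiples named in the text for hybridization and wash steps.
conditions = {
    "conventional hybridization (6 x SSC)": 6,
    "conventional wash (0.6 x SSC)": 0.6,
    "stringent hybridization (4 x SSC)": 4,
    "stringent wash (0.1 x SSC)": 0.1,
}
sodium = {name: sodium_molar(x) for name, x in conditions.items()}
```

Under these assumptions the stringent wash contains about one sixth of the sodium of the conventional wash, which is what makes it the more discriminating of the two.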

Important biological activities of the encoded proteins,
preferably MP-52, comprise a mitogenic and osteo-inductive
potential and can be determined in assays according to
Seyedin et al., PNAS 82 (1985), 2267-2271 or Sampath and
Reddi, PNAS 78 (1981), 7599-7603.

The biological properties of the proteins according to the
invention, preferably MP-121, may be determined, e.g., by
means of the assays according to Wrana et al. (Cell 71, 1003-
1014 (1992)), Ling et al. (Proc. Natl. Acad. of Science, 82,
7217-7221 (1985)), Takuwa et al. (Am. J. Physiol., 257, E797-
E803 (1989)), Fann and Patterson (Proc. Natl. Acad. of
Science, 91, 43-47 (1994)), Broxmeyer et al. (Proc. Natl.
Acad. of Science, 85, 9052-9056 (1988)), Green et al. (Cell,
71, 731-739 (1992)), Partridge et al. (Endocrinology, 108,
213-219 (1981)) or Roberts et al. (PNAS 78, 5339-5343
(1981)).

Preferred embodiments of the present invention are DNA 
sequences as defined above and obtainable from vertebrates, 
preferably mammals such as pig or cow and from rodents such 
as rat or mouse, and in particular from primates such as 
humans.

Particularly preferred embodiments of the present invention
are the DNA sequences termed MP-52 and MP-121 which are shown
in SEQ ID Nos. 1 and 2. The corresponding transcripts of MP-52
were obtained from embryonic tissue and code for a
protein showing considerable amino acid homology to the
mature part of the BMP-like proteins (see Fig. 1a). The
protein sequences of BMP2 (=BMP2A) and BMP4 (=BMP2B) are
described in Wozney et al., Science Vol 242, 1528-1534
(1988). The respective sequences of BMP5, BMP6 and BMP7 are
described in Celeste et al., Proc. Natl. Acad. Sci. USA Vol 87,
9843-9847 (1990). Some typical sequence homologies, which are
specific to known BMP sequences only, were also found in the
propeptide part of MP-52, whereas other parts of the
precursor part of MP-52 show marked differences to BMP
precursors. The mRNA of MP-121 was detected in liver tissue,
and its corresponding amino acid sequence shows homology to
the amino acid sequences of the Inhibin protein chains (see
Fig. 1b). cDNA sequences encoding TGF-β-like proteins have
not yet been isolated from liver tissue, probably due to a
low abundance of TGF-β specific transcripts in this tissue.
In embryonic tissue, however, sequences encoding known
TGF-β-like proteins can be found in relative abundance. The
inventors have recently detected the presence of a collection
of TGF-β-like proteins in liver as well. The high background
level of clones related to known factors of this group
presents the main difficulty in establishing novel TGF-β-
related sequences from these and probably other tissues. In
the present invention, the cloning was carried out according
to the method described below. Once the DNA sequence has been
cloned, the preparation of host cells capable of producing
the TGF-β-like proteins and the production of said proteins
can be easily accomplished using known recombinant DNA
techniques comprising constructing the expression plasmids
encoding said protein and transforming a host cell with said
expression plasmid, cultivating the transformant in a
suitable culture medium, and recovering the product having
TGF-β-like activity.

Thus, the invention also relates to recombinant molecules
comprising DNA sequences as described above, optionally
linked to an expression control sequence. Such vectors may be
useful in the production of TGF-β-like proteins in stably or
transiently transformed cells. Several animal, plant, fungal 
and bacterial systems may be employed for the transformation 
and subsequent cultivation process. Preferably, expression 
vectors which can be used in the invention contain sequences 
necessary for the replication in the host cell and are 
autonomously replicable. It is also preferable to use vectors 
containing selectable marker genes which allow transformed cells
to be easily selected. The necessary operation is
well-known to those skilled in the art. 

It is another object of the invention to provide a host cell 
transformed by an expression plasmid of the invention and 
capable of producing a protein of the TGF-β family. Examples
of suitable host cells include various eukaryotic and 
prokaryotic cells, such as E. coli, insect cells, plant 
cells, mammalian cells, and fungi such as yeast. 

Another object of the present invention is to provide a
protein of the TGF-β family encoded by the DNA sequences
described above and displaying biological features such as
tissue-inductive, in particular osteo-inductive and/or
mitogenic capacities possibly relevant to therapeutic
treatments. The above-mentioned features of the protein might
vary depending upon the formation of homodimers or
heterodimers. Such structures may prove useful in clinical
applications as well. The amino acid sequences of the
especially preferred proteins of the TGF-β family (MP-52 and
MP-121) are shown in SEQ ID NO. 3 and SEQ ID NO. 4.

It is a further aspect of the invention to provide a process
for the production of TGF-β-like proteins. Such a process
comprises cultivating a host cell being transformed with a
DNA sequence of the present invention in a suitable culture
medium and purifying the TGF-β-like protein produced. Thus,
this process will allow the production of a sufficient amount 
of the desired protein for use in medical treatments or in 
applications using cell culture techniques requiring growth 
factors for their performance. The host cell is obtainable 
from bacteria such as Bacillus or Escherichia coli, from 
fungi such as yeast, from plants such as tobacco, potato, or 
Arabidopsis, and from animals, in particular vertebrate cell 
lines such as the Mo-, COS- or CHO cell line. 

Yet another aspect of the present invention is to provide a 
particularly sensitive process for the isolation of DNA 
sequences corresponding to low abundance mRNAs in the tissues 
of interest. The process of the invention comprises the 
combination of four different steps. First, the mRNA has to 
be isolated and used in an amplification reaction using 
oligonucleotide primers. The sequence of the oligonucleotide
primers contains degenerated DNA sequences derived from the 
amino acid sequence of proteins related to the gene of 
interest. This step may lead to the amplification of already 
known members of the gene family of interest, and these 
undesired sequences would therefore have to be eliminated. 
This object is achieved by using restriction endonucleases 
which are known to digest the already-analyzed members of the 
gene family. After treatment of the amplified DNA population 
with said restriction endonucleases, the remaining desired 
DNA sequences are isolated by gel electrophoresis and 
reamplified in a third step by an amplification reaction, and 
in a fourth step they are cloned into suitable vectors for 
sequencing. To increase the sensitivity and efficiency, steps 
two and three are repeatedly performed, at least two times in 
one embodiment of this process.
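
The second step of the process, in which already-known family members are removed by restriction digestion so that only uncut amplification products are carried forward, can be pictured as a simple filter. The sketch below is an illustration under stated assumptions: the recognition sequences are those commonly catalogued for the enzymes named in Example 1 (Sph I, Ava I, AlwN I, Tfi I) and should be checked against supplier data, and the three toy amplicons are invented for the demonstration.

```python
import re

# Ambiguity symbols used in the recognition sequences below.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "N": "[ACGT]",
}

# Commonly catalogued recognition sequences (assumption, not from the text).
# All four are palindromic, so searching one strand is sufficient.
SITES = {
    "SphI": "GCATGC",
    "AvaI": "CYCGRG",
    "AlwNI": "CAGNNNCTG",
    "TfiI": "GAWTC",
}


def site_regex(site):
    """Compile a recognition sequence with ambiguity codes into a regex."""
    return re.compile("".join(IUPAC[symbol] for symbol in site))


def is_cut(sequence, enzymes):
    """True if any of the named enzymes has a site in the sequence."""
    return any(site_regex(SITES[e]).search(sequence) for e in enzymes)


def uncut(amplicons, enzymes):
    """Keep only amplicons that survive digestion with all named enzymes."""
    return {
        name: seq for name, seq in amplicons.items() if not is_cut(seq, enzymes)
    }


# Invented toy sequences: two "known" products with sites, one candidate without.
TOY_AMPLICONS = {
    "known_1": "AAAGCATGCAAA",      # contains an SphI site
    "known_2": "TTCTCGGGTT",        # contains an AvaI site
    "candidate": "AAATTTCCCAAATTT",
}
survivors = uncut(TOY_AMPLICONS, ["SphI", "AvaI"])
```

In this toy case only `candidate` survives, which mirrors the idea that products lacking the chosen sites are isolated from the gel as the uncleaved band and reamplified.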

In a preferred embodiment, the isolation process described
above is used for the isolation of DNA sequences from liver
tissue. In a particularly preferred embodiment of the above-
described process, one primer used for the PCR experiment is
homologous to the polyA tail of the mRNA, whereas the second
primer contains a gene-specific sequence. The techniques
employed in carrying out the different steps of this process
(such as amplification reactions or sequencing techniques)
are known to the person skilled in the art and described, for
instance, in Sambrook et al., 1989, "Molecular Cloning: A
laboratory manual", Cold Spring Harbor Laboratory Press.

It is another object of the present invention to provide
pharmaceutical compositions containing a therapeutically
effective amount of a protein of the TGF-β family of the
present invention. Optionally, such a composition comprises a
pharmaceutically acceptable carrier. Such a therapeutic
composition can be used in wound healing and tissue repair as
well as in the healing of bone, cartilage, or tooth defects,
either individually or in conjunction with suitable carriers,
and possibly with other related proteins or growth factors.
Thus, a therapeutic composition of the invention may include,
but is not limited to, the MP-52 encoded protein in
conjunction with the MP-121 encoded protein, and optionally
with other known biologically-active substances such as EGF
(epidermal growth factor) or PDGF (platelet derived growth
factor). Another possible clinical application of a TGF-β-like
protein is the use as a suppressor of the immune
response, which would prevent rejection of organ transplants.
The pharmaceutical composition comprising the proteins of the
invention can also be used prophylactically, or can be
employed in cosmetic plastic surgery. Furthermore, the
application of the composition is not limited to humans but
can include animals, in particular domestic animals, as
well. Possible applications of the pharmaceutical composition
according to the invention furthermore include treatment or
prevention of connective tissue, skin, mucous membrane,
endothelial, epithelial, neuronal or renal defects, use in
the case of dental implants, use as a morphogenic factor
for inducing liver tissue growth, induction of the
proliferation of precursor cells or bone marrow cells,
maintenance of a differentiated state, and the treatment of
impaired fertility or use for contraception.

Finally, another object of the present invention is an 
antibody or antibody fragment, which is capable of 
specifically binding to the proteins of the present 
invention. Methods for raising such specific antibodies are
common general knowledge. Preferably such an antibody is a
monoclonal antibody. Such antibodies or antibody fragments 
might be useful for diagnostic methods. 

The following examples illustrate in detail the invention
disclosed, but should not be construed as limiting the
invention.

Example 1 
Isolation of MP-121 

1.1 Total RNA was isolated from human liver tissue (40-year-old
male) by the method of Chirgwin et al., Biochemistry
18 (1979), 5294-5299. Poly(A)+ RNA was separated from
total RNA by oligo(dT) chromatography according to the
instructions of the manufacturer (Stratagene Poly(A)
Quick columns).

1.2 For the reverse transcription reaction, poly(A)+ RNA
(1-2.5 µg) derived from liver tissue was heated for 5
minutes to 65°C and cooled rapidly on ice. The reverse
transcription reagents containing 27 U RNA guard
(Pharmacia), 2.5 µg oligo d(T)12-18 (Pharmacia), 5 x
buffer (250 mM Tris/HCl pH 8.5; 50 mM MgCl2; 50 mM DTT;
5 mM each dNTP; 600 mM KCl) and 20 units avian
myeloblastosis virus reverse transcriptase (AMV,
Boehringer Mannheim) per µg poly(A)+ RNA were added.
The reaction mixture (25 µl) was incubated for 2 hours
at 42°C. The liver cDNA pool was stored at -20°C.
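
The working concentrations in the reverse transcription mixture follow from diluting the 5 x buffer. The sketch below assumes that the buffer is used at 1 x final strength in the 25 µl reaction, which the text implies but does not state explicitly; the names are illustrative.

```python
# Composition of the 5 x buffer as given in step 1.2 (all values in mM).
STOCK_5X_MM = {
    "Tris/HCl pH 8.5": 250,
    "MgCl2": 50,
    "DTT": 50,
    "each dNTP": 5,
    "KCl": 600,
}


def final_concentrations(stock_mm, factor=5):
    """Concentrations (mM) after diluting a concentrated buffer to 1 x."""
    return {component: conc / factor for component, conc in stock_mm.items()}


def buffer_volume_ul(reaction_ul, factor=5):
    """Volume of concentrated buffer needed for a reaction of the given size."""
    return reaction_ul / factor


working_mm = final_concentrations(STOCK_5X_MM)
buffer_ul = buffer_volume_ul(25)  # 25 µl reaction, as in step 1.2
```

Under that assumption the reaction would contain 50 mM Tris/HCl, 10 mM MgCl2, 10 mM DTT, 1 mM of each dNTP and 120 mM KCl, supplied by 5 µl of the 5 x buffer.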

1.3 The deoxynucleotide primers OD and OID (Fig. 2) designed
to prime the amplification reaction were generated on an
automated DNA synthesizer (Biosearch). Purification was
done by denaturing polyacrylamide gel electrophoresis
and isolation of the main band from the gel by
isotachophoresis. The oligonucleotides were designed by
aligning the nucleic acid sequences of some known
members of the TGF-β family and selecting regions of the
highest conservation. An alignment of this region is
shown in Fig. 2. In order to facilitate cloning, both
oligonucleotides contained EcoR I restriction sites and
OD additionally contained an Nco I restriction site at
its 5' terminus.

1.4 In the polymerase chain reaction, a liver-derived cDNA
pool was used as a template in a 50 µl reaction mixture.
The amplification was performed in 1 x PCR buffer (16.6
mM (NH4)2SO4; 67 mM Tris/HCl pH 8.8; 2 mM MgCl2; 6.7 µM
EDTA; 10 mM β-mercaptoethanol; 170 µg/ml BSA (Gibco)),
200 µM each dNTP (Pharmacia), 30 pmol each
oligonucleotide (OD and OID) and 1.5 units Taq
polymerase (AmpliTaq, Perkin Elmer Cetus). The PCR
reaction contained cDNA corresponding to 30 ng of
poly(A)+ RNA as starting material. The reaction mixture was
overlaid with paraffin and 40 cycles (cycle 1: 80s
93°C/40s 52°C/40s 72°C; cycles 2-9: 60s 93°C/40s
52°C/40s 72°C; cycles 10-29: 60s 93°C/40s 52°C/60s
72°C; cycles 30-31: 60s 93°C/40s 52°C/90s 72°C; cycle
40: 60s 93°C/40s 52°C/420s 72°C) of the PCR were
performed. Six PCR reaction mixtures were pooled,
purified by subsequent extractions with equal volumes of
phenol, phenol/chloroform (1:1 (v/v)) and
chloroform/isoamyl alcohol (24:1 (v/v)) and concentrated
by ethanol precipitation.

1.5 One half of the obtained PCR pool was sufficient for
digestion with the restriction enzymes Sph I (Pharmacia)
and AlwN I (Biolabs). The second half was digested in a
series of reactions by the restriction enzymes Ava I
(BRL), AlwN I (Biolabs) and Tfi I (Biolabs). The
restriction endonuclease digestions were performed in
100 µl at 37°C (except Tfi I at 65°C) using 8 units of
each enzyme in a 2- to 12-hour reaction in a buffer
recommended by the manufacturer.

1.6 Each DNA sample was fractionated by electrophoresis using
a 4% agarose gel (3% FMC Nusieve agarose, Biozym and 1%
agarose, BRL) in Tris borate buffer (89 mM Tris base, 89
mM boric acid, 2 mM EDTA, pH 8). After ethidium bromide
staining, uncleaved amplification products (about 200 bp;
size marker was run in parallel) were excised from the
gel and isolated by phenol extraction: an equal volume
of phenol was added to the excised agarose, which was
minced to small pieces, frozen for 10 minutes, vortexed
and centrifuged. The aqueous phase was collected, the
interphase reextracted with the same volume of TE buffer,
centrifuged and both aqueous phases were combined. DNA
was further purified twice by phenol/chloroform and once
by chloroform/isoamyl alcohol extraction.

1.7 After ethanol precipitation, one fourth or one fifth of 
the isolated DNA was reamplified using the same 
52°C/60s 72°C; cycle 13: 60s 93°C/40s 52°C/420s
72°C). The reamplification products were purified,
restricted with the same enzymes as above and the
uncleaved products were isolated from agarose gels as
mentioned above for the amplification products. The
reamplification followed by restriction and gel
isolation was repeated once.

1.8 After the last isolation from the gel, the amplification
products were digested by 4 units EcoR I (Pharmacia) for
2 hours at 37°C using the buffer recommended by the
manufacturer. One fourth of the restriction mixture was
ligated to the vector pBluescript II SK+ (Stratagene)
which was digested likewise by EcoR I. After ligation,
24 clones from each enzyme combination were further
analyzed by sequence analysis. The sample restricted by
AlwN I and Sph I contained no new sequences, only BMP6
and Inhibin βA sequences. 19 identical new sequences,
which were named MP-121, were found in the Ava I, AlwN I
and Tfi I restricted samples. The MP-121 containing
plasmids were called pSK MP-121 (OD/OID). One sequence
differed from this predominant sequence by two
nucleotide exchanges. Ligation reaction and
transformation in E. coli HB101 were performed as
described in Sambrook et al., Molecular cloning: A
laboratory manual (1989). Transformants were selected by
ampicillin resistance and the plasmid DNAs were isolated
according to standard protocols (Sambrook et al.
(1989)). Analysis was done by sequencing the double-
stranded plasmids by "dideoxyribonucleotide chain
termination sequencing" with the sequencing kit
"Sequenase Version 2.0" (United States Biochemical
Corporation).
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The clone was completed to the 3* end of the c-DNA by a 
method described in detail by Frohman (Amplifications, 
published by Perkin-Elmer Corporation, issue 5 (1990) , 
pp 11-15) . The same liver mRNA which was used for the 
isolation of the first fragment of MP-121 was reverse 
transcribed using a primer consisting of oligo dT (16 
residues) linked to an adaptor primer 

(AGAATTCGCATGCCATGGTCGACGAAGC (T) 1 6 ) . Amplification was 
performed using the adaptor primer 

(AGAATTCGCATGCCATGGTCGACG) and an internal primer 
(GGCTACGCCATGAACTTCTGCATA) of the MP-121 sequence. The 
amplification products were reamplified using a nested 
internal primer (ACATAGCAGGCATGCCTGGTATTG) of the MP-121 
sequence and the adaptor primer. The reamplif ication 
products were cloned after restriction with Sph I in the 
likewise restricted vector pT7/T3 U19 (Pharmacia) and 
sequenced with the sequencing kit "Sequenase Version 
2.0" (United States Biochemical Corporation). Clones 
were characterized by their sequence overlap to the 3 1 
end of the known MP-121 sequence. 
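
The adaptor primer used above carries several restriction sites, one of which (Sph I) is used for cloning the reamplification products. The following is a minimal sketch of how those sites can be located. The recognition sequences (GAATTC, GCATGC, CCATGG, GTCGAC) are standard values that are not stated in the text, and the names `SITES` and `find_sites` are illustrative only.

```python
# Recognition sequences are standard values, not taken from the text above.
SITES = {
    "EcoRI": "GAATTC",
    "SphI": "GCATGC",
    "NcoI": "CCATGG",
    "SalI": "GTCGAC",
}

# Primers as given in the text.
ADAPTOR = "AGAATTCGCATGCCATGGTCGACG"
OLIGO_DT_ADAPTOR = "AGAATTCGCATGCCATGGTCGACGAAGC" + "T" * 16


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for every site found in seq."""
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)          # convert to 1-based
            start = seq.find(motif, start + 1)   # allow overlapping hits
        if positions:
            hits[enzyme] = positions
    return hits


adaptor_sites = find_sites(ADAPTOR)

# The adaptor primer is the 5' part of the oligo dT primer, so it can prime
# on any cDNA that was synthesized with the oligo dT primer.
adaptor_is_prefix = OLIGO_DT_ADAPTOR.startswith(ADAPTOR)

print(adaptor_sites)
print(adaptor_is_prefix)
```

Under these assumptions the Sph I site lies inside the adaptor, which is consistent with cutting the reamplification products with Sph I before ligation.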

One clone, called p121Lt 3' MP13, was used to isolate a
NcoI (blunt ended with T4 polymerase)/SphI fragment.
This fragment was ligated into a pSK MP-121 (OD/OID)
vector, where the OD primer sequence was located close
to the T7 primer sequence of the pSK+ multiple cloning
site, opened with SphI/SmaI. The resulting plasmid was
called pMP121DFus6. It contains MP-121 specific sequence
information starting from position 922 and ending with
position 1360 of SEQ ID NO. 2.

1.9 Using a DdeI fragment of pMP-121DFus6 as a probe,
ranging from nucleotide 931 to nucleotide 1304 of SEQ ID
NO. 2, a human liver cDNA library (Clontech, # HL3006b,
Lot 36223) was screened by a common method described in
detail by Ausubel et al. (Current Protocols in Molecular
Biology, published by Greene Publishing Associates and
Wiley-Interscience (1989)). From 8.1 x 10^5 phages, 24
mixed clones were isolated and re-screened using the
DdeI fragment. 10 clones were confirmed and the EcoRI
fragments subcloned into Bluescript SK (Stratagene, #
212206). EcoRI restriction analysis showed that one
clone (SK121 L9.1, deposited at the DSM (#9177)) has an
insert of about 2.3 kb. This clone contains the complete
reading frame of the MP-121 gene and further information
to the 5' and 3' end in addition to the sequence
isolated from mRNA by the described amplification
methods. The complete sequence of the EcoRI insert of
SK121 L9.1 is shown in SEQ ID NO. 2. The reading frame of
the MP-121 gene could be confirmed by sequencing of
another clone (SK121 L11.1), having the identical
reading frame sequence as SK121 L9.1. The beginning of
the start codon of the MP-121 sequence of SK121 L9.1
could be determined at position 128 of SEQ ID NO. 2,
since there are three stop codons in-frame in front of
the start codon at positions 62, 77 and 92. The start
site of the mature MP-121 is at position 836 of SEQ ID
NO. 2 in sequence analogy to other members of the TGF-β
family, corresponding to amino acid 237 in SEQ ID NO. 4.
The stop codon is at position 1184 of SEQ ID NO. 2.
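
The reading-frame coordinates quoted above can be cross-checked against the sequence listing. The sketch below is illustrative and not part of the described method: it copies the first 180 nucleotides of SEQ ID NO. 2, lists the stop codons that lie upstream of and in frame with position 128, and converts the quoted nucleotide positions into codon numbers. The function names are hypothetical.

```python
# First 180 nucleotides of SEQ ID NO. 2, copied from the sequence listing.
SEQ2_START = (
    "CAAGGAGCCA TGCCAGCTGG ACACACACTT CTTCCAGGGC CTCTGGCAGC CAGGACAGAG"
    "TTGAGACCAC AGCTGTTGAG ACCCTGAGCC CTGAGTCTGT ATTGCTCAAG AAGGGCCTTC"
    "CCCAGCAATG ACCTCCTCAT TGCTTCTGGC CTTTCTCCTC CTGGCTCCAA CCACAGTGGC"
).replace(" ", "")

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_POS = 128    # 1-based position of the start codon (from the text)
MATURE_POS = 836   # 1-based start of the mature protein (from the text)
STOP_POS = 1184    # 1-based position of the stop codon (from the text)


def upstream_inframe_stops(seq, start_pos):
    """1-based positions of stop codons upstream of, and in frame with, start_pos."""
    stops = []
    pos = start_pos - 3
    while pos >= 1:
        if seq[pos - 1:pos + 2] in STOP_CODONS:
            stops.append(pos)
        pos -= 3
    return sorted(stops)


def codon_number(pos, start_pos=START_POS):
    """Codon number (1 = start codon) of a 1-based position in the reading frame."""
    offset = pos - start_pos
    if offset % 3:
        raise ValueError("position is not the first base of a codon")
    return offset // 3 + 1


start_codon = SEQ2_START[START_POS - 1:START_POS + 2]
stops = upstream_inframe_stops(SEQ2_START, START_POS)
mature_residue = codon_number(MATURE_POS)
protein_length = codon_number(STOP_POS) - 1   # the stop codon itself is not a residue

print(start_codon, stops, mature_residue, protein_length)
```

The resulting protein length agrees with the 352 amino acids stated for SEQ ID NO. 4.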

Plasmid SK121 L9.1 was deposited under number 9177 at
DSM (Deutsche Sammlung von Mikroorganismen und
Zellkulturen), Mascheroder Weg 1b, Braunschweig, on
26.04.94.

Example 2
Isolation of MP-52 

A further cDNA sequence, MP-52, was isolated according to the
above described method (Example 1) by using RNA from human
embryo (8-9 weeks old) tissue. The PCR reaction contained
cDNA corresponding to 20 ng of poly(A+) RNA as starting
material. The reamplification step was repeated twice for
both enzyme combinations. After ligation, 24 clones from each
enzyme combination were further analyzed by sequence
analysis. The sample restricted by AlwN I and Sph I yielded a
new sequence which was named MP-52. The other clones
comprised mainly BMP6 and one BMP7 sequence. The sample
restricted by Ava I, AlwN I and Tfi I contained no new
sequences, but consisted mainly of BMP7 and a few Inhibin βA
sequences.

The clone was completed to the 3' end according to the above
described method (Example 1). The same embryo mRNA, which was
used for the isolation of the first fragment of MP-52, was
reverse transcribed as in Example 1. Amplification was
performed using the adaptor primer (AGAATTCGCATGCCATGGTCGACG)
and an internal primer (CTTGAGTACGAGGCTTTCCACTG) of the MP-52
sequence. The amplification products were reamplified using a
nested adaptor primer (ATTCGCATGCCATGGTCGACGAAG) and a nested
internal primer (GGAGCCCACGAATCATGCAGTCA) of the MP-52
sequence. The reamplification products were cloned after
restriction with Nco I in a likewise restricted vector (pUC
19 (Pharmacia #27-4951-01) with an altered multiple cloning
site containing a unique Nco I restriction site) and
sequenced. Clones were characterized by their sequence
overlap to the 3' end of the known MP-52 sequence. Some of
these clones contain the last 143 basepairs of the 3' end of
the sequence shown in SEQ ID NO: 1 and the 0.56 kb 3' non
translated region (sequence not shown). One of these was used
as a probe to screen a human genomic library (Stratagene
#946203) by a common method described in detail by Ausubel et
al. (Current Protocols in Molecular Biology, published by
Greene Publishing Associates and Wiley-Interscience (1989)).
From 8 x 10^5 λ phages one phage (λ 2.7.4), which was shown to
contain an insert of about 20 kb, was isolated and deposited
at the DSM (#7387). This clone contains, in addition to the
sequence isolated from mRNA by the described amplification
methods, sequence information further to the 5' end. For
sequence analysis a Hind III fragment of about 7.5 kb was
subcloned in a likewise restricted vector (Bluescript SK,
Stratagene #212206). This plasmid, called SKL 52 (H3) MP12,
was also deposited at the DSM (#7353). Sequence information
derived from this clone is shown in SEQ ID NO: 1. At
nucleotide No. 1050, the determined cDNA and the respective
genomic sequence differ by one basepair (cDNA: G; genomic
DNA: A). We assume the genomic sequence to be correct, as it
was confirmed also by sequencing of the amplified genomic DNA
from embryonic tissue which had been used for the mRNA
preparation. The genomic DNA contains an intron of about 2 kb
between basepairs 332 and 333 of SEQ ID NO: 1. The sequence
of the intron is not shown. The correct exon/exon junction
was confirmed by sequencing an amplification product derived
from cDNA which comprises this region. This sequencing
information was obtained by the help of a slightly modified
method described in detail by Frohman (Amplifications,
published by Perkin-Elmer Corporation, issue 5 (1990),
pp 11-15). The same embryo RNA which was used for the isolation of
the 3' end of MP-52 was reverse transcribed using an internal
primer of the MP-52 sequence oriented in the 5' direction
(ACAGCAGGTGGGTGGTGTGGACT). A polyA tail was appended to the
5' end of the first strand cDNA by using terminal
transferase. A two step amplification was performed first by
application of a primer consisting of oligo dT and an adaptor
primer (AGAATTCGCATGCCATGGTCGACGAAGC(T)16) and secondly an
adaptor primer (AGAATTCGCATGCCATGGTCGACG) and an internal
primer (CCAGCAGCCCATCCTTCTCC) of the MP-52 sequence. The
amplification products were reamplified using the same
adaptor primer and a nested internal primer
(TCCAGGGCACTAATGTCAAACACG) of the MP-52 sequence.
Subsequently the reamplification products were again
reamplified using a nested adaptor primer
(ATTCGCATGCCATGGTCGACGAAG) and a nested internal primer
(ACTAATGTCAAACACGTACCTCTG) of the MP-52 sequence. The final
reamplification products were blunt end cloned in a vector
(Bluescript SK, Stratagene #212206) restricted with EcoRV.
Clones were characterized by their sequence overlap to the
DNA of λ 2.7.4.
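
The positions and orientations of the MP-52 primers named above can be checked against SEQ ID NO: 1. The following is a minimal sketch under stated assumptions, not the authors' procedure: it copies nucleotides 961 to 1140 of SEQ ID NO: 1 from the sequence listing and searches for each primer on both strands. The names `REGION`, `OFFSET`, `reverse_complement` and `locate` are illustrative.

```python
# Nucleotides 961-1140 of SEQ ID NO: 1, copied from the sequence listing.
OFFSET = 960
REGION = (
    "ACCCCTTGAG TACGAGGCTT TCCACTGCGA GGGGCTGTGC GAGTTCCCAT TGCGCTCCCA"
    "CCTGGAGCCC ACGAATCATG CAGTCATCCA GACCCTGATG AACTCCATGG ACCCCGAGTC"
    "CACACCACCC ACCTGCTGTG TGCCCACGCG GCTGAGTCCC ATCAGCATCC TCTTCATTGA"
).replace(" ", "")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(primer):
    """Return (1-based start in SEQ ID NO: 1, strand), or None if not found."""
    idx = REGION.find(primer)
    if idx != -1:
        return OFFSET + idx + 1, "+"
    idx = REGION.find(reverse_complement(primer))
    if idx != -1:
        return OFFSET + idx + 1, "-"
    return None


internal_3prime = locate("CTTGAGTACGAGGCTTTCCACTG")   # internal primer, 3' extension
nested_3prime = locate("GGAGCCCACGAATCATGCAGTCA")     # nested internal primer
rt_5prime = locate("ACAGCAGGTGGGTGGTGTGGACT")         # primer oriented in the 5' direction

# Base listed at nucleotide No. 1050 (the genomic variant according to the text).
base_1050 = REGION[1050 - OFFSET - 1]

print(internal_3prime, nested_3prime, rt_5prime, base_1050)
```

The primer used for reverse transcription toward the 5' end matches the complementary strand, as expected for a primer oriented in the 5' direction.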

Plasmid SKL 52 (H3) MP12 was deposited under number 7353 at
DSM (Deutsche Sammlung von Mikroorganismen und Zellkulturen),
Mascheroder Weg 1b, 3300 Braunschweig, on 10.12.1992.

Phage λ 2.7.4 was deposited under number 7387 at DSM on
13.1.1993.

Claims

1. A DNA sequence encoding a protein of the TGF-β family
selected from the following group:

(a) a DNA sequence comprising the nucleotides

ATGAACTCCATGGACCCCGAGTCCACA

with the reading frame for the protein starting at the
first nucleotide

(b) a DNA sequence comprising the nucleotides

CTTCTCAAGGCCAACACAGCTGCAGGCACC

with the reading frame for the protein starting at the
first nucleotide

(c) DNA sequences which are degenerate as a result of
the genetic code to the DNA sequences of (a) and
(b)

(d) allelic derivatives of the DNA sequences of (a) and
(b)

(e) DNA sequences hybridizing to the DNA sequences in
(a), (b), (c) or (d) and encoding a protein
containing the amino acid sequence

Met-Asn-Ser-Met-Asp-Pro-Glu-Ser-Thr

or

Leu-Leu-Lys-Ala-Asn-Thr-Ala-Ala-Gly-Thr

(f) DNA sequences hybridizing to the DNA sequences in
(a), (b), (c) and (d) and encoding a protein having
essentially the same biological properties.

2. The DNA sequence according to claim 1 which is a
vertebrate DNA sequence, a mammalian DNA sequence,
preferably a primate, human, porcine, bovine, or rodent
DNA sequence, and preferably including a rat and a mouse
DNA sequence.

3. The DNA sequence according to claim 1 or 2 which is a
DNA sequence comprising the nucleotides as shown in SEQ
ID NO. 1.

4. The DNA sequence according to claim 1 or 2 which is a
DNA sequence comprising the nucleotides as shown in SEQ
ID NO. 2.

5. The DNA sequence according to claim 1 or 2 which is a
DNA sequence comprising the nucleotides as shown in SEQ
ID NO. 5.

6. The DNA sequence according to claim 1 or 2 which is a
DNA sequence comprising the nucleotides as shown in SEQ
ID NO. 6.

7. A recombinant DNA molecule comprising a DNA sequence
according to any one of claims 1 to 6.

8. The recombinant DNA molecule according to claim 7 in
which said DNA sequence is functionally linked to an
expression-control sequence.

9. A host containing a recombinant DNA molecule according
to claim 7 or 8.

10. The host according to claim 9 which is a bacterium, a
fungus, a plant cell or an animal cell.

11. A process for the production of a protein of the TGF-β
family comprising cultivating a host according to claim
9 or 10 and recovering said TGF-β protein from the
culture.

12. A protein of the TGF-β family encoded by a DNA sequence
according to any one of claims 1 to 4 or a fragment
thereof encoded by a DNA sequence according to claim 5
or 6.

13. A protein according to claim 12 comprising the amino
acid sequence of SEQ ID NO: 3.

14. A protein according to claim 12 comprising the amino
acid sequence of SEQ ID NO. 4.

15. A pharmaceutical composition containing a protein of the
TGF-β family according to any one of claims 12 to 14,
optionally in combination with a pharmaceutically
acceptable carrier.

16. The pharmaceutical composition according to claim 15
for the treatment or prevention of bone, cartilage,
connective tissue, skin, mucous membrane, endothelial,
epithelial, neuronal, renal or tooth defects, for use in
the case of dental implants, for use in wound healing or
tissue repair processes, as a morphogenic factor used
for inducing liver tissue growth, induction of the
proliferation of precursor cells or bone marrow cells,
for maintaining a differentiated state and for the
treatment of impaired fertility or for contraception.

17. An antibody or antibody fragment which is capable of
specifically binding to a protein of claims 12, 13 or
14.

18. Antibody or antibody fragment according to claim 17
which is a monoclonal antibody.

19. Use of an antibody or antibody fragment according to
claims 17 or 18 for diagnostic methods.
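
The relation between the nucleotide sequences of claim 1 (a) and (b), the amino acid sequences of claim 1 (e) and the degenerate sequences of claim 1 (c) can be illustrated with the standard genetic code. This is a minimal sketch; the function names are illustrative, and the degeneracy count simply multiplies the number of synonymous codons per residue.

```python
from math import prod

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}

SEQ_A = "ATGAACTCCATGGACCCCGAGTCCACA"        # claim 1 (a)
SEQ_B = "CTTCTCAAGGCCAACACAGCTGCAGGCACC"     # claim 1 (b)


def translate(dna):
    """Translate from the first nucleotide; return one-letter residues."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def three_letter(protein):
    """Format one-letter residues as a hyphenated three-letter string."""
    return "-".join(THREE_LETTER[aa] for aa in protein)


def degenerate_count(protein):
    """Number of distinct DNA sequences that encode the same residues."""
    codons_per_residue = {aa: AMINO.count(aa) for aa in set(AMINO)}
    return prod(codons_per_residue[aa] for aa in protein)


peptide_a = three_letter(translate(SEQ_A))
peptide_b = three_letter(translate(SEQ_B))
count_a = degenerate_count(translate(SEQ_A))
count_b = degenerate_count(translate(SEQ_B))

print(peptide_a, count_a)
print(peptide_b, count_b)
```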




Abstract 



The invention provides DNA sequences encoding novel members
of the TGF-β family of proteins. The TGF-β family comprises
proteins which function as growth and/or differentiation 
factors and which are useful in medical applications. 
Accordingly, the invention also describes the isolation of 
the above-mentioned DNA sequences, the expression of the 
encoded proteins, the production of said proteins and 
pharmaceutical compositions containing said proteins. 
Fig. 2a

              Eco RI Nco I
OD           ATGAATTCCCATGGACCTGGGCTGGMAKGAMTGGAT
BMP 2                      ACGTGGGGTGGAATGACTGGAT
BMP 3                      ATATTGGCTGGAGTGAATGGAT
BMP 4                      ATGTGGGCTGGAATGACTGGAT
BMP 7                      ACCTGGGCTGGCAGGACTGGAT
TGF-β1                     AGGACCTCGGCTGGAAGTGGAT
TGF-β2                     GGGATCTAGGGTGGAAATGGAT
TGF-β3                     AGGATCTGGGCTGGAAGTGGGT
inhibin α                  AGCTGGGCTGGGAACGGTGGAT
inhibin βA                 ACATCGGCTGGAATGACTGGAT
inhibin βB                 TCATCGGCTGGAACGACTGGAT

Fig. 2b

              Eco RI
OID          ATGAATTCGAGCTGCGTSGGSRCACAGCA
BMP 2                GAGTTCTGTCGGGACACAGCA
BMP 3                CATCTTTTCTGGTACACAGCA
BMP 4                CAGTTCAGTGGGCACACAACA
BMP 7                GAGCTGCGTGGGCGCACAGCA
TGF-β1               CAGCGCCTGCGGCACGCAGCA
TGF-β2               TAAATCTTGGGACACGCAGCA
TGF-β3               CAGGTCCTGGGGCACGCAGCA
inhibin α            CCCTGGGAGAGCAGCACAGCA
inhibin βA           CAGCTTGGTGGGCACACAGCA
inhibin βB           CAGCTTGGTGGGAATGCAGCA
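
The degenerate positions of the OD and OID primers can be compared with the family sequences shown in Fig. 2a and Fig. 2b. The sketch below is illustrative only. It assumes that M, K, S and R are IUPAC ambiguity codes (M = A/C, K = G/T, S = C/G, R = A/G), that the family sequences align with the 3' part of each primer (the part following the restriction site tail), and that the primer strings are as read from the figures. The names are hypothetical.

```python
from math import prod

# IUPAC nucleotide ambiguity codes (standard values, not stated in the figures).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

OD_CORE = "ACCTGGGCTGGMAKGAMTGGAT"    # 3' part of OD (after the Eco RI/Nco I tail)
OID_CORE = "GAGCTGCGTSGGSRCACAGCA"    # 3' part of OID (after the Eco RI tail)

FIG_2A = {
    "BMP 2": "ACGTGGGGTGGAATGACTGGAT",
    "BMP 7": "ACCTGGGCTGGCAGGACTGGAT",
}
FIG_2B = {
    "BMP 7": "GAGCTGCGTGGGCGCACAGCA",
}


def mismatches(primer, target):
    """Count positions where target is not covered by the degenerate primer."""
    if len(primer) != len(target):
        raise ValueError("primer and target must have the same length")
    return sum(base not in IUPAC[code] for code, base in zip(primer, target))


def degeneracy(primer):
    """Number of distinct unambiguous oligonucleotides in the primer mixture."""
    return prod(len(IUPAC[code]) for code in primer)


od_vs_bmp7 = mismatches(OD_CORE, FIG_2A["BMP 7"])
od_vs_bmp2 = mismatches(OD_CORE, FIG_2A["BMP 2"])
oid_vs_bmp7 = mismatches(OID_CORE, FIG_2B["BMP 7"])

print(od_vs_bmp7, od_vs_bmp2, oid_vs_bmp7, degeneracy(OD_CORE), degeneracy(OID_CORE))
```

Under these assumptions both primers cover the BMP 7 sequence without mismatch, while other family members differ at a few positions.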



SEQ ID NO. 1 

SEQUENCE TYPE: Nucleic Acid 
SEQUENCE LENGTH: 1207 Base Pairs

STRANDEDNESS: Double or Single
TOPOLOGY: Linear

MOLECULAR TYPE: DNA or cDNA from mRNA

ORIGINAL SOURCE: -
ORGANISM: Human

IMMEDIATE EXPERIMENTAL SOURCE: Embryo Tissue

PROPERTIES: Sequence Coding for Human TGF-β-like Protein (MP-52)

ACCGGGCGGC CCTGAACCCA AGCCAGGACA CCCTCCCCAA ACAAGGCAGG CTACAGCCCG 60
GACTGTGACC CCAAAAGGAC AGCTTCCCGG AGGCAAGGCA CCCCCAAAAG CAGGATCTGT 120
CCCCAGCTCC TTCCTGCTGA AGAAGGCCAG GGAGCCCGGG CCCCCACGAG AGCCCAAGGA 180
GCCGTTTCGC CCACCCCCCA TCACACCCCA CGAGTACATG CTCTCGCTGT ACAGGACGCT 240
GTCCGATGCT GACAGAAAGG GAGGCAACAG CAGCGTGAAG TTGGAGGCTG GCCTGGCCAA 300
CACCATCACC AGCTTTATTG ACAAAGGGCA AGATGACCGA GGTCCCGTGG TCAGGAAGCA 360
GAGGTACGTG TTTGACATTA GTGCCCTGGA GAAGGATGGG CTGCTGGGGG CCGAGCTGCG 420
GATCTTGCGG AAGAAGCCCT CGGACACGGC CAAGCCAGCG GCCCCCGGAG GCGGGCGGGC 480
TGCCCAGCTG AAGCTGTCCA GCTGCCCCAG CGGCCGGCAG CCGGCCTCCT TGCTGGATGT 540
GCGCTCCGTG CCAGGCCTGG ACGGATCTGG CTGGGAGGTG TTCGACATCT GGAAGCTCTT 600
CCGAAACTTT AAGAACTCGG CCCAGCTGTG CCTGGAGCTG GAGGCCTGGG AACGGGGCAG 660
GGCCGTGGAC CTCCGTGGCC TGGGCTTCGA CCGCGCCGCC CGGCAGGTCC ACGAGAAGGC 720
CCTGTTCCTG GTGTTTGGCC GCACCAAGAA ACGGGACCTG TTCTTTAATG AGATTAAGGC 780
CCGCTCTGGC CAGGACGATA AGACCGTGTA TGAGTACCTG TTCAGCCAGC GGCGAAAACG 840
GCGGGCCCCA CTGGCCACTC GCCAGGGCAA GCGACCCAGC AAGAACCTTA AGGCTCGCTG 900
CAGTCGGAAG GCACTGCATG TCAACTTCAA GGACATGGGC TGGGACGACT GGATCATCGC 960
ACCCCTTGAG TACGAGGCTT TCCACTGCGA GGGGCTGTGC GAGTTCCCAT TGCGCTCCCA 1020
CCTGGAGCCC ACGAATCATG CAGTCATCCA GACCCTGATG AACTCCATGG ACCCCGAGTC 1080
CACACCACCC ACCTGCTGTG TGCCCACGCG GCTGAGTCCC ATCAGCATCC TCTTCATTGA 1140
CTCTGCCAAC AACGTGGTGT ATAAGCAGTA TGAGGACATG GTCGTGGAGT CGTGTGGCTG 1200
CAGGTAG 1207



SEQ ID NO. 2 



SEQUENCE TYPE: Nucleic Acid 
SEQUENCE LENGTH: 2272 Base Pairs

STRANDEDNESS: Double or Single
TOPOLOGY: Linear

MOLECULAR TYPE: cDNA from mRNA

ORIGINAL SOURCE: -
ORGANISM: Human

IMMEDIATE EXPERIMENTAL SOURCE: Liver Tissue

PROPERTIES: Sequence Coding for Human TGF-β-like Protein (MP-121)

CAAGGAGCCA TGCCAGCTGG ACACACACTT CTTCCAGGGC CTCTGGCAGC CAGGACAGAG 60
TTGAGACCAC AGCTGTTGAG ACCCTGAGCC CTGAGTCTGT ATTGCTCAAG AAGGGCCTTC 120
CCCAGCAATG ACCTCCTCAT TGCTTCTGGC CTTTCTCCTC CTGGCTCCAA CCACAGTGGC 180
CACTCCCAGA GCTGGCGGTC AGTGTCCAGC ATGTGGGGGG CCCACCTTGG AACTGGAGAG 240
CCAGCGGGAG CTGCTTCTTG ATCTGGCCAA GAGAAGCATC TTGGACAAGC TGCACCTCAC 300
CCAGCGCCCA ACACTGAACC GCCCTGTGTC CAGAGCTGCT TTGAGGACTG CACTGCAGCA 360
CCTCCACGGG GTCCCACAGG GGGCACTTCT AGAGGACAAC AGGGAACAGG AATGTGAAAT 420
CATCAGCTTT GCTGAGACAG GCCTCTCCAC CATCAACCAG ACTCGTCTTG ATTTTCACTT 480
CTCCTCTGAT AGAACTGCTG GTGACAGGGA GGTCCAGCAG GCCAGTCTCA TGTTCTTTGT 540
GCAGCTCCCT TCCAATACCA CTTGGACCTT GAAAGTGAGA GTCCTTGTGC TGGGTCCACA 600
TAATACCAAC CTCACCTTGG CTACTCAGTA CCTGCTGGAG GTGGATGCCA GTGGCTGGCA 660
TCAACTCCCC CTAGGGCCTG AAGCTCAAGC TGCCTGCAGC CAGGGGCACC TGACCCTGGA 720
GCTGGTACTT GAAGGCCAGG TAGCCCAGAG CTCAGTCATC CTGGGTGGAG CTGCCCATAG 780
GCCTTTTGTG GCAGCCCGGG TGAGAGTTGG GGGCAAACAC CAGATTCACC GACGAGGCAT 840
CGACTGCCAA GGAGGGTCCA GGATGTGCTG TCGACAAGAG TTTTTTGTGG ACTTCCGTGA 900
GATTGGCTGG CACGACTGGA TCATCCAGCC TGAGGGCTAC GCCATGAACT TCTGCATAGG 960
GCAGTGCCCA CTACACATAG CAGGCATGCC TGGTATTGCT GCCTCCTTTC ACACTGCAGT 1020
GCTCAATCTT CTCAAGGCCA ACACAGCTGC AGGCACCACT GGAGGGGGCT CATGCTGTGT 1080
ACCCACGGCC CGGCGCCCCC TGTCTCTGCT CTATTATGAC AGGGACAGCA ACATTGTCAA 1140
GACTGACATA CCTGACATGG TAGTAGAGGC CTGTGGGTGC AGTTAGTCTA TGTGTGGTAT 1200
GGGCAGCCCA AGGTTGCATG GGAAAACACG CCCCTACAGA AGTGCACTTC CTTGAGAGGA 1260
GGGAATGACC TCATTCTCTG TCCAGAATGT GGACTCCCTC TTCCTGAGCA TCTTATGGAA 1320
ATTACCCCAC CTTTGACTTG AAGAAACCTT CATCTAAAGC AAGTCACTGT GCCATCTTCC 1380
TGACCACTAC CCTCTTTCCT AGGGCATAGT CCATCCCGCT AGTCCATCCC GCTAGCCCCA 1440
CTCCAGGGAC TCAGACCCAT CTCCAACCAT GAGCAATGCC ATCTGGTTCC CAGGCAAAGA 1500
CACCCTTAGC TCACCTTTAA TAGACCCCAT AACCCACTAT GCCTTCCTGT CCTTTCTACT 1560
CAATGGTCCC CACTCCAAGA TGAGTTGACA CAACCCCTTC CCCCAATTTT TGTGGATCTC 1620
CAGAGAGGCC CTTCTTTGGA TTCACCAAAG TTTAGATCAC TGCTGCCCAA AATAGAGGCT 1680
TACCTACCCC CCTCTTTGTT GTGAGCCCCT GTCCTTCTTA GTTGTCCAGG TGAACTACTA 1740
AAGCTCTCTT TGCATACCTT CATCCATTTT TTGTCCTTCT CTGCCTTTCT CTATGCCCTT 1800
AAGGGGTGAC TTGCCTGAGC TCTATCACCT GAGCTCCCCT GCCCTCTGGC TTCCTGCTGA 1860
GGTCAGGGCA TTTCTTATCC CTGTTCCCTC TCTGTCTAGG TGTCATGGTT CTGTGTAACT 1920
GTGGCTATTC TGTGTCCCTA CACTACCTGG CTACCCCCTT CCATGGCCCC AGCTCTGCCT 1980
ACATTCTGAT TTTTTTTTTT TTTTTTTTTT TGAAAAGTTA AAAATTCCTT AATTTTTTAT 2040
TCCTGGTACC ACTACCACAA TTTACAGGGC AATATACCTG ATGTAATGAA AAGAAAAAGA 2100
AAAAGACAAA GCTACAACAG ATAAAAGACC TCAGGAATGT ACATCTAATT GACACTACAT 2160
TGCATTAATC AATAGCTGCA CTTTTTGCAA ACTGTGGCTA TGACAGTCCT GAACAAGAAG 2220
GGTTTCCTGT TTAAGCTGCA GTAACTTTTC TGACTATGGA TCATCGTTCC TT 2272



SEQ ID NO. 3 

SEQUENCE TYPE: Amino Acid 
SEQUENCE LENGTH: 401 Amino Acids 

ORIGINAL SOURCE: - 
ORGANISM: Human 

IMMEDIATE EXPERIMENTAL SOURCE: Embryo Tissue 
PROPERTIES: Human TGF-β-like Protein (MP-52)

PGGPEPKPGH PPQTRQATAR TVTPKGQLPG GKAPPKAGSV PSSFLLKKAR EPGPPREPKE 60
PFRPPPITPH EYMLSLYRTL SDADRKGGNS SVKLEAGLAN TITSFIDKGQ DDRGPVVRKQ 120
RYVFDISALE KDGLLGAELR ILRKKPSDTA KPAAPGGGRA AQLKLSSCPS GRQPASLLDV 180
RSVPGLDGSG WEVFDIWKLF RNFKNSAQLC LELEAWERGR AVDLRGLGFD RAARQVHEKA 240
LFLVFGRTKK RDLFFNEIKA RSGQDDKTVY EYLFSQRRKR RAPLATRQGK RPSKNLKARC 300
SRKALHVNFK DMGWDDWIIA PLEYEAFHCE GLCEFPLRSH LEPTNHAVIQ TLMNSMDPES 360
TPPTCCVPTR LSPISILFID SANNVVYKQY EDMVVESCGC R 401



SEQ ID NO. 4 

SEQUENCE TYPE: Amino Acid 
SEQUENCE LENGTH: 352 Amino Acids 

ORIGINAL SOURCE: - 
ORGANISM: Human 

PROPERTIES: Human TGF-β-like Protein (MP-121)

MTSSLLLAFL LLAPTTVATP RAGGQCPACG GPTLELESQR ELLLDLAKRS ILDKLHLTQR 60
PTLNRPVSRA ALRTALQHLH GVPQGALLED NREQECEIIS FAETGLSTIN QTRLDFHFSS 120
DRTAGDREVQ QASLMFFVQL PSNTTWTLKV RVLVLGPHNT NLTLATQYLL EVDASGWHQL 180
PLGPEAQAAC SQGHLTLELV LEGQVAQSSV ILGGAAHRPF VAARVRVGGK HQIHRRGIDC 240
QGGSRMCCRQ EFFVDFREIG WHDWIIQPEG YAMNFCIGQC PLHIAGMPGI AASFHTAVLN 300
LLKANTAAGT TGGGSCCVPT ARRPLSLLYY DRDSNIVKTD IPDMVVEACG CS 352



SEQ ID NO. 5 

SEQUENCE TYPE: Nucleic Acid 
SEQUENCE LENGTH: 265 Base Pairs 

STRANDEDNESS: Double or Single
TOPOLOGY: Linear

MOLECULAR TYPE: cDNA from mRNA

ORIGINAL SOURCE: -
ORGANISM: Human

IMMEDIATE EXPERIMENTAL SOURCE: Liver Tissue

PROPERTIES: Sequence coding for a Part of the Mature Human TGF-β-like Protein
(MP-121)

CATCCAGCCT GAGGGCTACG CCATGAACTT CTGCATAGGG CAGTGCCCAC TACACATAGC 60
AGGCATGCCT GGTATTGCTG CCTCCTTTCA CACTGCAGTG CTCAATCTTC TCAAGGCCAA 120
CACAGCTGCA GGCACCACTG GAGGGGGCTC ATGCTGTGTA CCCACGGCCC GGCGCCCCCT 180
GTCTCTGCTC TATTATGACA GGGACAGCAA CATTGTCAAG ACTGACATAC CTGACATGGT 240
AGTAGAGGCC TGTGGGTGCA GTTAG 265



SEQ ID NO. 6 



SEQUENCE TYPE: Nucleic Acid 
SEQUENCE LENGTH: 139 Base Pairs

STRANDEDNESS: Double or Single
TOPOLOGY: Linear

MOLECULAR TYPE: cDNA from mRNA

ORIGINAL SOURCE: -
ORGANISM: Human

IMMEDIATE EXPERIMENTAL SOURCE: Embryo Tissue

PROPERTIES: Sequence Coding for a Part of the Mature Human TGF-β-like Protein
(MP-52)

CATCGCACCC CTTGAGTACG AGGCTTTCCA CTGCGAGGGG CTGTGCGAGT TCCCATTGCG 60
CTCCCACCTG GAGCCCACGA ATCATGCAGT CATCCAGACC CTGATGAACT CCATGGACCC 120
CGAGTCCACA CCACCCACC 139



Figure 1a

        10         20         30         40         50
MP 52   CSRKALHVNF KDMGWDDWII APLEYEAFHC EGLCEFPLRS HLEPTNHAVI
BMP 2   CKRHPLYVDF SDVGWNDWIV APPGYHAFYC HGECPFPLAD HLNSTNHAIV
BMP 4   CRRHSLYVDF SDVGWNDWIV APPGYQAFYC HGDCPFPLAD HLNSTNHAIV
BMP 5   CKKHELYVSF RDLGWQDWII APEGYAAFYC DGECSFPLNA HMNATNHAIV
BMP 6   CRKHELYVSF QDLGWQDWII APKGYAANYC DGECSFPLNA HMNATNHAIV
BMP 7   CKKHELYVSF RDLGWQDWII APEGYAAYYC EGECAFPLNS YMNATNHAIV

        60         70         80         90         100
MP 52   QTLMNSMDPE STPPTCCVPT RLSPISILFI DSANNVVYKQ YEDMVVESCG CR
BMP 2   QTLVNSVNS- KIPKACCVPT ELSAISMLYL DENEKVVLKN YQDMVVEGCG CR
BMP 4   QTLVNSVNS- SIPKACCVPT ELSAISMLYL DEYDKVVLKN YQEMVVEGCG CR
BMP 5   QTLVHLMFPD HVPKPCCAPT KLNAISVLYF DDSSNVILKK YRNMVVRSCG CH
BMP 6   QTLVHLMNPE YVPKPCCAPT KLNAISVLYF DDNSNVILKK YRNMVVRACG CH
BMP 7   QTLVHFINPE TVPKPCCAPT QLNAISVLYF DDSSNVILKK YRNMVVRACG CH



Figure 1b

          10         20         30         40
MP 121    CCRQEFFVDF REIGWHDWII QPEGYAMNFC IGQCPLHIAG
Inhib βA  CCKKQFFVSF KDIGWNDWII APSGYHANYC EGECPSHIAG
Inhib βB  CCRQQFFIDF RLIGWNDWII APTGYYGNYC EGSCPAYLAG
Inhib α   CHRVALNISF QELGWERWIV YPPSFIFHYC HGGCGLHIP-

          50         60         70         80
MP 121    MPGIAASFHT AVLNLLKANT AAGTTGGGSC C--VPTARRP
Inhib βA  TSGSSLSFHS TVINHYRMRG HSPFANLKSC C--VPTKLRP
Inhib βB  VPGSASSFHT AVVNQYRMRG LNP-GTVNSC C--IPTKLST
Inhib α   PNLSLPV--- PGAPPTPAQP YSLLPGAQPC CAALPGTMRP

          90         100        110
MP 121    LSLLYYDRDS NIVKTD-IPD MVVEACGCS
Inhib βA  MSMLYYDDGQ NIIKKD-IQN MIVEECGCS
Inhib βB  MSMLYFDDEY NIVKRD-VPN MIVEECGCA
Inhib α   LHVRTTSDGG YSFKYETVPN LLTQHCACI



(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(ii) MOLECULE TYPE: DNA or cDNA from mRNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ACCGGGCGGC CCTGAACCCA AGCCAGGACA CCCTCCCCAA ACAAGGCAGG CTACAGCCCG 60
GACTGTGACC CCAAAAGGAC AGCTTCCCGG AGGCAAGGCA CCCCCAAAAG CAGGATCTGT 120
CCCCAGCTCC TTCCTGCTGA AGAAGGCCAG GGAGCCCGGG CCCCCACGAG AGCCCAAGGA 180
GCCGTTTCGC CCACCCCCCA TCACACCCCA CGAGTACATG CTCTCGCTGT ACAGGACGCT 240
GTCCGATGCT GACAGAAAGG GAGGCAACAG CAGCGTGAAG TTGGAGGCTG GCCTGGCCAA 300
CACCATCACC AGCTTTATTG ACAAAGGGCA AGATGACCGA GGTCCCGTGG TCAGGAAGCA 360
GAGGTACGTG TTTGACATTA GTGCCCTGGA GAAGGATGGG CTGCTGGGGG CCGAGCTGCG 420
GATCTTGCGG AAGAAGCCCT CGGACACGGC CAAGCCAGCG GCCCCCGGAG GCGGGCGGGC 480
TGCCCAGCTG AAGCTGTCCA GCTGCCCCAG CGGCCGGCAG CCGGCCTCCT TGCTGGATGT 540
GCGCTCCGTG CCAGGCCTGG ACGGATCTGG CTGGGAGGTG TTCGACATCT GGAAGCTCTT 600
CCGAAACTTT AAGAACTCGG CCCAGCTGTG CCTGGAGCTG GAGGCCTGGG AACGGGGCAG 660
GGCCGTGGAC CTCCGTGGCC TGGGCTTCGA CCGCGCCGCC CGGCAGGTCC ACGAGAAGGC 720
CCTGTTCCTG GTGTTTGGCC GCACCAAGAA ACGGGACCTG TTCTTTAATG AGATTAAGGC 780
CCGCTCTGGC CAGGACGATA AGACCGTGTA TGAGTACCTG TTCAGCCAGC GGCGAAAACG 840
GCGGGCCCCA CTGGCCACTC GCCAGGGCAA GCGACCCAGC AAGAACCTTA AGGCTCGCTG 900
CAGTCGGAAG GCACTGCATG TCAACTTCAA GGACATGGGC TGGGACGACT GGATCATCGC 960
ACCCCTTGAG TACGAGGCTT TCCACTGCGA GGGGCTGTGC GAGTTCCCAT TGCGCTCCCA 1020
CCTGGAGCCC ACGAATCATG CAGTCATCCA GACCCTGATG AACTCCATGG ACCCCGAGTC 1080
CACACCACCC ACCTGCTGTG TGCCCACGCG GCTGAGTCCC ATCAGCATCC TCTTCATTGA 1140
CTCTGCCAAC AACGTGGTGT ATAAGCAGTA TGAGGACATG GTCGTGGAGT CGTGTGGCTG 1200
CAGGTAG 1207

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2272 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA from mRNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

CAAGGAGCCA TGCCAGCTGG ACACACACTT CTTCCAGGGC CTCTGGCAGC CAGGACAGAG 60
TTGAGACCAC AGCTGTTGAG ACCCTGAGCC CTGAGTCTGT ATTGCTCAAG AAGGGCCTTC 120
CCCAGCAATG ACCTCCTCAT TGCTTCTGGC CTTTCTCCTC CTGGCTCCAA CCACAGTGGC 180
CACTCCCAGA GCTGGCGGTC AGTGTCCAGC ATGTGGGGGG CCCACCTTGG AACTGGAGAG 240
CCAGCGGGAG CTGCTTCTTG ATCTGGCCAA GAGAAGCATC TTGGACAAGC TGCACCTCAC 300
CCAGCGCCCA ACACTGAACC GCCCTGTGTC CAGAGCTGCT TTGAGGACTG CACTGCAGCA 360
CCTCCACGGG GTCCCACAGG GGGCACTTCT AGAGGACAAC AGGGAACAGG AATGTGAAAT 420
CATCAGCTTT GCTGAGACAG GCCTCTCCAC CATCAACCAG ACTCGTCTTG ATTTTCACTT 480
CTCCTCTGAT AGAACTGCTG GTGACAGGGA GGTCCAGCAG GCCAGTCTCA TGTTCTTTGT 540
GCAGCTCCCT TCCAATACCA CTTGGACCTT GAAAGTGAGA GTCCTTGTGC TGGGTCCACA 600
TAATACCAAC CTCACCTTGG CTACTCAGTA CCTGCTGGAG GTGGATGCCA GTGGCTGGCA 660
TCAACTCCCC CTAGGGCCTG AAGCTCAAGC TGCCTGCAGC CAGGGGCACC TGACCCTGGA 720
GCTGGTACTT GAAGGCCAGG TAGCCCAGAG CTCAGTCATC CTGGGTGGAG CTGCCCATAG 780
GCCTTTTGTG GCAGCCCGGG TGAGAGTTGG GGGCAAACAC CAGATTCACC GACGAGGCAT 840
CGACTGCCAA GGAGGGTCCA GGATGTGCTG TCGACAAGAG TTTTTTGTGG ACTTCCGTGA 900
GATTGGCTGG CACGACTGGA TCATCCAGCC TGAGGGCTAC GCCATGAACT TCTGCATAGG 960
GCAGTGCCCA CTACACATAG CAGGCATGCC TGGTATTGCT GCCTCCTTTC ACACTGCAGT 1020
GCTCAATCTT CTCAAGGCCA ACACAGCTGC AGGCACCACT GGAGGGGGCT CATGCTGTGT 1080
ACCCACGGCC CGGCGCCCCC TGTCTCTGCT CTATTATGAC AGGGACAGCA ACATTGTCAA 1140
GACTGACATA CCTGACATGG TAGTAGAGGC CTGTGGGTGC AGTTAGTCTA TGTGTGGTAT 1200
GGGCAGCCCA AGGTTGCATG GGAAAACACG CCCCTACAGA AGTGCACTTC CTTGAGAGGA 1260
GGGAATGACC TCATTCTCTG TCCAGAATGT GGACTCCCTC TTCCTGAGCA TCTTATGGAA 1320
ATTACCCCAC CTTTGACTTG AAGAAACCTT CATCTAAAGC AAGTCACTGT GCCATCTTCC 1380
TGACCACTAC CCTCTTTCCT AGGGCATAGT CCATCCCGCT AGTCCATCCC GCTAGCCCCA 1440
CTCCAGGGAC TCAGACCCAT CTCCAACCAT GAGCAATGCC ATCTGGTTCC CAGGCAAAGA 1500
CACCCTTAGC TCACCTTTAA TAGACCCCAT AACCCACTAT GCCTTCCTGT CCTTTCTACT 1560
CAATGGTCCC CACTCCAAGA TGAGTTGACA CAACCCCTTC CCCCAATTTT TGTGGATCTC 1620
CAGAGAGGCC CTTCTTTGGA TTCACCAAAG TTTAGATCAC TGCTGCCCAA AATAGAGGCT 1680
TACCTACCCC CCTCTTTGTT GTGAGCCCCT GTCCTTCTTA GTTGTCCAGG TGAACTACTA 1740
AAGCTCTCTT TGCATACCTT CATCCATTTT TTGTCCTTCT CTGCCTTTCT CTATGCCCTT 1800
AAGGGGTGAC TTGCCTGAGC TCTATCACCT GAGCTCCCCT GCCCTCTGGC TTCCTGCTGA 1860
GGTCAGGGCA TTTCTTATCC CTGTTCCCTC TCTGTCTAGG TGTCATGGTT CTGTGTAACT 1920
GTGGCTATTC TGTGTCCCTA CACTACCTGG CTACCCCCTT CCATGGCCCC AGCTCTGCCT 1980
ACATTCTGAT TTTTTTTTTT TTTTTTTTTT TGAAAAGTTA AAAATTCCTT AATTTTTTAT 2040
TCCTGGTACC ACTACCACAA TTTACAGGGC AATATACCTG ATGTAATGAA AAGAAAAAGA 2100
AAAAGACAAA GCTACAACAG ATAAAAGACC TCAGGAATGT ACATCTAATT GACACTACAT 2160
TGCATTAATC AATAGCTGCA CTTTTTGCAA ACTGTGGCTA TGACAGTCCT GAACAAGAAG 2220
GGTTTCCTGT TTAAGCTGCA GTAACTTTTC TGACTATGGA TCATCGTTCC TT 2272

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 401 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Pro Gly Gly Pro Glu Pro Lys Pro Gly His Pro Pro Gln Thr Arg Gln
  1               5                  10                  15

Ala Thr Ala Arg Thr Val Thr Pro Lys Gly Gln Leu Pro Gly Gly Lys
             20                  25                  30

Ala Pro Pro Lys Ala Gly Ser Val Pro Ser Ser Phe Leu Leu Lys Lys
         35                  40                  45

Ala Arg Glu Pro Gly Pro Pro Arg Glu Pro Lys Glu Pro Phe Arg Pro
     50                  55                  60

Pro Pro Ile Thr Pro His Glu Tyr Met Leu Ser Leu Tyr Arg Thr Leu
 65                  70                  75                  80

Ser Asp Ala Asp Arg Lys Gly Gly Asn Ser Ser Val Lys Leu Glu Ala
                 85                  90                  95

Gly Leu Ala Asn Thr Ile Thr Ser Phe Ile Asp Lys Gly Gln Asp Asp
            100                 105                 110

Arg Gly Pro Val Val Arg Lys Gln Arg Tyr Val Phe Asp Ile Ser Ala
        115                 120                 125

Leu Glu Lys Asp Gly Leu Leu Gly Ala Glu Leu Arg Ile Leu Arg Lys
    130                 135                 140

Lys Pro Ser Asp Thr Ala Lys Pro Ala Ala Pro Gly Gly Gly Arg Ala
145                 150                 155                 160

Ala Gln Leu Lys Leu Ser Ser Cys Pro Ser Gly Arg Gln Pro Ala Ser
                165                 170                 175

Leu Leu Asp Val Arg Ser Val Pro Gly Leu Asp Gly Ser Gly Trp Glu
            180                 185                 190

Val Phe Asp Ile Trp Lys Leu Phe Arg Asn Phe Lys Asn Ser Ala Gln
        195                 200                 205

Leu Cys Leu Glu Leu Glu Ala Trp Glu Arg Gly Arg Ala Val Asp Leu
    210                 215                 220

Arg Gly Leu Gly Phe Asp Arg Ala Ala Arg Gln Val His Glu Lys Ala
225                 230                 235                 240

Leu Phe Leu Val Phe Gly Arg Thr Lys Lys Arg Asp Leu Phe Phe Asn
                245                 250                 255

Glu Ile Lys Ala Arg Ser Gly Gln Asp Asp Lys Thr Val Tyr Glu Tyr
            260                 265                 270

Leu Phe Ser Gln Arg Arg Lys Arg Arg Ala Pro Leu Ala Thr Arg Gln
        275                 280                 285

Gly Lys Arg Pro Ser Lys Asn Leu Lys Ala Arg Cys Ser Arg Lys Ala
    290                 295                 300

Leu His Val Asn Phe Lys Asp Met Gly Trp Asp Asp Trp Ile Ile Ala
305                 310                 315                 320

Pro Leu Glu Tyr Glu Ala Phe His Cys Glu Gly Leu Cys Glu Phe Pro
                325                 330                 335

Leu Arg Ser His Leu Glu Pro Thr Asn His Ala Val Ile Gln Thr Leu
            340                 345                 350

Met Asn Ser Met Asp Pro Glu Ser Thr Pro Pro Thr Cys Cys Val Pro
        355                 360                 365

Thr Arg Leu Ser Pro Ile Ser Ile Leu Phe Ile Asp Ser Ala Asn Asn
    370                 375                 380

Val Val Tyr Lys Gln Tyr Glu Asp Met Val Val Glu Ser Cys Gly Cys
385                 390                 395                 400

Arg



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 352 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Thr Ser Ser Leu Leu Leu Ala Phe Leu Leu Leu Ala Pro Thr Thr
  1               5                  10                  15

Val Ala Thr Pro Arg Ala Gly Gly Gln Cys Pro Ala Cys Gly Gly Pro
             20                  25                  30

Thr Leu Glu Leu Glu Ser Gln Arg Glu Leu Leu Leu Asp Leu Ala Lys
         35                  40                  45

Arg Ser Ile Leu Asp Lys Leu His Leu Thr Gln Arg Pro Thr Leu Asn
     50                  55                  60

Arg Pro Val Ser Arg Ala Ala Leu Arg Thr Ala Leu Gln His Leu His
 65                  70                  75                  80

Gly Val Pro Gln Gly Ala Leu Leu Glu Asp Asn Arg Glu Gln Glu Cys
                 85                  90                  95

Glu Ile Ile Ser Phe Ala Glu Thr Gly Leu Ser Thr Ile Asn Gln Thr
            100                 105                 110

Arg Leu Asp Phe His Phe Ser Ser Asp Arg Thr Ala Gly Asp Arg Glu
        115                 120                 125

Val Gln Gln Ala Ser Leu Met Phe Phe Val Gln Leu Pro Ser Asn Thr
    130                 135                 140

Thr Trp Thr Leu Lys Val Arg Val Leu Val Leu Gly Pro His Asn Thr
145                 150                 155                 160

Asn Leu Thr Leu Ala Thr Gln Tyr Leu Leu Glu Val Asp Ala Ser Gly
                165                 170                 175

Trp His Gln Leu Pro Leu Gly Pro Glu Ala Gln Ala Ala Cys Ser Gln
            180                 185                 190

Gly His Leu Thr Leu Glu Leu Val Leu Glu Gly Gln Val Ala Gln Ser
        195                 200                 205

Ser Val Ile Leu Gly Gly Ala Ala His Arg Pro Phe Val Ala Ala Arg
    210                 215                 220

Val Arg Val Gly Gly Lys His Gln Ile His Arg Arg Gly Ile Asp Cys
225                 230                 235                 240

Gln Gly Gly Ser Arg Met Cys Cys Arg Gln Glu Phe Phe Val Asp Phe
                245                 250                 255

Arg Glu Ile Gly Trp His Asp Trp Ile Ile Gln Pro Glu Gly Tyr Ala
            260                 265                 270

Met Asn Phe Cys Ile Gly Gln Cys Pro Leu His Ile Ala Gly Met Pro
        275                 280                 285

Gly Ile Ala Ala Ser Phe His Thr Ala Val Leu Asn Leu Leu Lys Ala
    290                 295                 300

Asn Thr Ala Ala Gly Thr Thr Gly Gly Gly Ser Cys Cys Val Pro Thr
305                 310                 315                 320

Ala Arg Arg Pro Leu Ser Leu Leu Tyr Tyr Asp Arg Asp Ser Asn Ile
                325                 330                 335

Val Lys Thr Asp Ile Pro Asp Met Val Val Glu Ala Cys Gly Cys Ser
            340                 345                 350



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 265 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA from mRNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

CATCCAGCCT GAGGGCTACG CCATGAACTT CTGCATAGGG CAGTGCCCAC TACACATAGC 60 

AGGCATGCCT GGTATTGCTG CCTCCTTTCA CACTGCAGTG CTCAATCTTC TCAAGGCCAA 120 

CACAGCTGCA GGCACCACTG GAGGGGGCTC ATGCTGTGTA CCCACGGCCC GGCGCCCCCT 180

GTCTCTGCTC TATTATGACA GGGACAGCAA CATTGTCAAG ACTGACATAC CTGACATGGT 240

AGTAGAGGCC TGTGGGTGCA GTTAG 265

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 139 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA from mRNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

CATCGCACCC CTTGAGTACG AGGCTTTCCA CTGCGAGGGG CTGTGCGAGT TCCCATTGCG 60 
CTCCCACCTG GAGCCCACGA ATCATGCAGT CATCCAGACC CTGATGAACT CCATGGACCC 120 
CGAGTCCACA CCACCCACC 139

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7 
ATGAACTCCA TGGACCCCGA GTCCACA 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8

CTTCTCAAGG CCAACACAGC TGCAGGCACC

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Asn Ser Met Asp Pro Glu Ser Thr 
1 5 



(2) INFORMATION FOR SEQ ID NO: 10: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Leu Leu Lys Ala Asn Thr Ala Ala Gly Thr
1 5 10

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
AGAATTCGCA TGCCATGGTC GACGAAGCTT TTTTTTTTTT TTTTTT 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12 
AGAATTCGCA TGCCATGGTC GACG 

(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13
GGCTACGCCA TGAACTTCTG CATA

(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14
ACATAGCAGG CATGCCTGGT ATTG

(2) INFORMATION FOR SEQ ID NO: 15: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15 



CTTGAGTACG AGGCTTTCCA CTG 



(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16 



ATTCGCATGC CATGGTCGAC GAAG 



(2) INFORMATION FOR SEQ ID NO: 17: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17 



GGAGCCCACG AATCATGCAG TCA 



(2) INFORMATION FOR SEQ ID NO: 18: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18 

sjiAGCAGGTG GGTGGTGTGG ACT 

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19
cJagcagccc ATCCTTCTCC 
(2) INFORMATION FOR SEQ ID NO: 20: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20 



TCCAGGGCAC TAATGTCAAA CACG 



(2) INFORMATION FOR SEQ ID NO: 21: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
ACTAATGTCA AACACGTACC TCTG 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 102 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

Cys Ser Arg Lys Ala Leu His Val Asn Phe Lys Asp Met Gly Trp Asp
1 5 10 15

Asp Trp Ile Ile Ala Pro Leu Glu Tyr Glu Ala Phe His Cys Glu Gly
20 25 30

Leu Cys Glu Phe Pro Leu Arg Ser His Leu Glu Pro Thr Asn His Ala
35 40 45

Val Ile Gln Thr Leu Met Asn Ser Met Asp Pro Glu Ser Thr Pro Pro
50 55 60

Thr Cys Cys Val Pro Thr Arg Leu Ser Pro Ile Ser Ile Leu Phe Ile
65 70 75 80

Asp Ser Ala Asn Asn Val Val Tyr Lys Gln Tyr Glu Asp Met Val Val
85 90 95

Glu Ser Cys Gly Cys Arg
100

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 101 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Cys Lys Arg His Pro Leu Tyr Val Asp Phe Ser Asp Val Gly Trp Asn
1 5 10 15

Asp Trp Ile Val Ala Pro Pro Gly Tyr His Ala Phe Tyr Cys His Gly
20 25 30

Glu Cys Pro Phe Pro Leu Ala Asp His Leu Asn Ser Thr Asn His Ala
35 40 45

Ile Val Gln Thr Leu Val Asn Ser Val Asn Ser Lys Ile Pro Lys Ala
50 55 60

Cys Cys Val Pro Thr Glu Leu Ser Ala Ile Ser Met Leu Tyr Leu Asp
65 70 75 80

Glu Asn Glu Lys Val Val Leu Lys Asn Tyr Gln Asp Met Val Val Glu
85 90 95

Gly Cys Gly Cys Arg
100

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 101 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

Cys Arg Arg His Ser Leu Tyr Val Asp Phe Ser Asp Val Gly Trp Asn
1 5 10 15

Asp Trp Ile Val Ala Pro Pro Gly Tyr Gln Ala Phe Tyr Cys His Gly
20 25 30

Asp Cys Pro Phe Pro Leu Ala Asp His Leu Asn Ser Thr Asn His Ala
35 40 45

Ile Val Gln Thr Leu Val Asn Ser Val Asn Ser Ser Ile Pro Lys Ala
50 55 60

Cys Cys Val Pro Thr Glu Leu Ser Ala Ile Ser Met Leu Tyr Leu Asp
65 70 75 80

Glu Tyr Asp Lys Val Val Leu Lys Asn Tyr Gln Glu Met Val Val Glu
85 90 95

Gly Cys Gly Cys Arg 
100 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 102 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Cys Lys Lys His Glu Leu Tyr Val Ser Phe Arg Asp Leu Gly Trp Gln
1 5 10 15

Asp Trp Ile Ile Ala Pro Glu Gly Tyr Ala Ala Phe Tyr Cys Asp Gly
20 25 30

Glu Cys Ser Phe Pro Leu Asn Ala His Met Asn Ala Thr Asn His Ala
35 40 45

Ile Val Gln Thr Leu Val His Leu Met Phe Pro Asp His Val Pro Lys
50 55 60

Pro Cys Cys Ala Pro Thr Lys Leu Asn Ala Ile Ser Val Leu Tyr Phe
65 70 75 80

Asp Asp Ser Ser Asn Val Ile Leu Lys Lys Tyr Arg Asn Met Val Val
85 90 95

Arg Ser Cys Gly Cys His
100

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 102 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 



Cys Arg Lys His Glu Leu Tyr Val Ser Phe Gln Asp Leu Gly Trp Gln
1 5 10 15

Asp Trp Ile Ile Ala Pro Lys Gly Tyr Ala Ala Asn Tyr Cys Asp Gly
20 25 30

Glu Cys Ser Phe Pro Leu Asn Ala His Met Asn Ala Thr Asn His Ala
35 40 45

Ile Val Gln Thr Leu Val His Leu Met Asn Pro Glu Tyr Val Pro Lys
50 55 60

Pro Cys Cys Ala Pro Thr Lys Leu Asn Ala Ile Ser Val Leu Tyr Phe
65 70 75 80

Asp Asp Asn Ser Asn Val Ile Leu Lys Lys Tyr Arg Asn Met Val Val
85 90 95

Arg Ala Cys Gly Cys His
100



(2) INFORMATION FOR SEQ ID NO: 27: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 102 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 

Cys Lys Lys His Glu Leu Tyr Val Ser Phe Arg Asp Leu Gly Trp Gln
1 5 10 15

Asp Trp Ile Ile Ala Pro Glu Gly Tyr Ala Ala Tyr Tyr Cys Glu Gly
20 25 30

Glu Cys Ala Phe Pro Leu Asn Ser Tyr Met Asn Ala Thr Asn His Ala
35 40 45

Ile Val Gln Thr Leu Val His Phe Ile Asn Pro Glu Thr Val Pro Lys
50 55 60

Pro Cys Cys Ala Pro Thr Gln Leu Asn Ala Ile Ser Val Leu Tyr Phe
65 70 75 80

Asp Asp Ser Ser Asn Val Ile Leu Lys Lys Tyr Arg Asn Met Val Val
85 90 95

Arg Ala Cys Gly Cys His

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 106 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

Cys Cys Arg Gln Glu Phe Phe Val Asp Phe Arg Glu Ile Gly Trp His
1 5 10 15

Asp Trp Ile Ile Gln Pro Glu Gly Tyr Ala Met Asn Phe Cys Ile Gly
20 25 30

Gln Cys Pro Leu His Ile Ala Gly Met Pro Gly Ile Ala Ala Ser Phe
35 40 45

His Thr Ala Val Leu Asn Leu Leu Lys Ala Asn Thr Ala Ala Gly Thr
50 55 60

Thr Gly Gly Gly Ser Cys Cys Val Pro Thr Ala Arg Arg Pro Leu Ser
65 70 75 80

Leu Leu Tyr Tyr Asp Arg Asp Ser Asn Ile Val Lys Thr Asp Ile Pro
85 90 95

Asp Met Val Val Glu Ala Cys Gly Cys Ser 
100 105 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 106 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

Cys Cys Lys Lys Gln Phe Phe Val Ser Phe Lys Asp Ile Gly Trp Asn
1 5 10 15

Asp Trp Ile Ile Ala Pro Ser Gly Tyr His Ala Asn Tyr Cys Glu Gly
20 25 30

Glu Cys Pro Ser His Ile Ala Gly Thr Ser Gly Ser Ser Leu Ser Phe
35 40 45

His Ser Thr Val Ile Asn His Tyr Arg Met Arg Gly His Ser Pro Phe
50 55 60

Ala Asn Leu Lys Ser Cys Cys Val Pro Thr Lys Leu Arg Pro Met Ser
65 70 75 80

Met Leu Tyr Tyr Asp Asp Gly Gln Asn Ile Ile Lys Lys Asp Ile Gln
85 90 95

Asn Met Ile Val Glu Glu Cys Gly Cys Ser
100 105 

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 105 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Cys Cys Arg Gln Gln Phe Phe Ile Asp Phe Arg Leu Ile Gly Trp Asn
1 5 10 15

Asp Trp Ile Ile Ala Pro Thr Gly Tyr Tyr Gly Asn Tyr Cys Glu Gly
20 25 30

Ser Cys Pro Ala Tyr Leu Ala Gly Val Pro Gly Ser Ala Ser Ser Phe
35 40 45

His Thr Ala Val Val Asn Gln Tyr Arg Met Arg Gly Leu Asn Pro Gly
50 55 60

Thr Val Asn Ser Cys Cys Ile Pro Thr Lys Leu Ser Thr Met Ser Met
65 70 75 80

Leu Tyr Phe Asp Asp Glu Tyr Asn Ile Val Lys Arg Asp Val Pro Asn
85 90 95

Met Ile Val Glu Glu Cys Gly Cys Ala
100 105

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Cys His Arg Val Ala Leu Asn Ile Ser Phe Gln Glu Leu Gly Trp Glu
1 5 10 15

Arg Trp Ile Val Tyr Pro Pro Ser Phe Ile Phe His Tyr Cys His Gly
20 25 30

Gly Cys Gly Leu His Ile Pro Pro Asn Leu Ser Leu Pro Val Pro Gly
35 40 45

Ala Pro Pro Thr Pro Ala Gln Pro Tyr Ser Leu Leu Pro Gly Ala Gln
50 55 60

Pro Cys Cys Ala Ala Leu Pro Gly Thr Met Arg Pro Leu His Val Arg
65 70 75 80

Thr Thr Ser Asp Gly Gly Tyr Ser Phe Lys Tyr Glu Thr Val Pro Asn
85 90 95

Leu Leu Thr Gln His Cys Ala Cys Ile
100 105 

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

ATGAATTCCC ATGGACCTGG GCTGGMAKGA MTGGAT 36
(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
ACGTGGGGTG GAATGACTGG AT 22

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34 
ATATTGGCTG GAGTGAATGG AT 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35
AGGTGGGCTG GAATGACTGG AT

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36 
ACCTGGGCTG GCAGGACTGG AT 
(2) INFORMATION FOR SEQ ID NO: 37: 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37 
AGGACCTCGG CTGGAAGTGG AT 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38

GGGATCTAGG GTGGAAATGG AT

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39 
AGGATCTGGG CTGGAAGTGG GT 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40 
AGCTGGGCTG GGAACGGTGG AT 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41

AjATCGGCTG GAATGACTGG AT 

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42 
TCATCGGCTG GAACGACTGG AT 
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43 
ATGAATTCGA GCTGCGTSGG SRCACAGCA 
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44 
GSGTTCTGTC GGGACACAGC A 
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45
CATCTTTTCT GGTACACAGC A 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46
CAGTTCAGTG GGCACACAAC A 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47 
GAGCTGCGTG GGCGCACAGC A 
(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48

cJgcgcctgc GGCACGCAGC a 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49 
TAAATCTTGG GACACGCAGC A

(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50 
CAGGTCCTGG GGCACGCAGC A 
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51
QSCTGGGAGA GCAGCACAGC A 
(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52 
CAGCTTGGTG GGCACACAGC A 
(2) INFORMATION FOR SEQ ID NO: 53: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53 
CAGCTTGGTG GGAATGCAGC A